
require(ggplot2)
require(lattice) # nicer scatter plots
require(plyr)
require(grid) # contains the arrow function
require(biOps)
require(doMC) # for parallel code
require(EBImage)
## To install EBImage
# source("http://bioconductor.org/biocLite.R")
# biocLite("EBImage")

# start parallel environment
registerDoMC()
# functions for converting images back and forth
im.to.df<-function(in.img) {
    out.im<-expand.grid(x=1:nrow(in.img),y=1:ncol(in.img))
    out.im$val<-as.vector(in.img)
    out.im
}
df.to.im<-function(in.df,val.col="val",inv=F) {
  in.vals<-in.df[[val.col]]
  if(is.logical(in.vals)) in.vals<-as.integer(in.vals*255)
  if(inv) in.vals<-255-in.vals
  out.mat<-matrix(in.vals,nrow=length(unique(in.df$x)),byrow=F)
  attr(out.mat,"type")<-"grey"
  out.mat
}
ddply.cutcols<-function(...,cols=1) {
  # run standard ddply command
  cur.table<-ddply(...)
  cutlabel.fixer<-function(oVal) {
    sapply(oVal,function(x) {
      cnv<-as.character(x)
      mean(as.numeric(strsplit(substr(cnv,2,nchar(cnv)-1),",")[[1]]))
    })
  }
  cutname.fixer<-function(c.str) {
    s.str<-strsplit(c.str,"(",fixed=T)[[1]]
    t.str<-strsplit(paste(s.str[c(2:length(s.str))],collapse="("),",")[[1]]
    paste(t.str[1:(length(t.str)-1)],collapse=",")
  }
  for(i in c(1:cols)) {
    cur.table[,i]<-cutlabel.fixer(cur.table[,i])
    names(cur.table)[i]<-cutname.fixer(names(cur.table)[i])
  }
  cur.table
}

Quantitative Big Imaging

author: Kevin Mader date: 13 March 2014 width: 1440 height: 900 transition: rotate

Advanced Segmentation and Labeling

Course Outline

Literature / Useful References

Lesson Outline

What we covered last time

Where segmentation fails: Inconsistent Illumination

With inconsistent or ever-changing illumination it may not be possible to apply the same threshold to every image.

cellImage<-im.to.df(jpeg::readJPEG("ext-figures/Cell_Colony.jpg"))
max.il<-2.5
il.vals<-runif(9,min=1/max.il,max=max.il)
im.vals<-ldply(1:length(il.vals),function(il.idx,th.val=0.75)
  cbind(cellImage[,c("x","y")],
        val=cellImage$val*il.vals[il.idx],
        il.idx=il.idx,
        # the comparison below was truncated in the source; the cells are the
        # dark pixels, so values under the threshold are labeled "Cells"
        in.thresh=ifelse(cellImage$val*il.vals[il.idx]<th.val,
                         "Cells","Background")))
Cell Colony with Different Illuminations


ggplot(subset(im.vals,in.thresh=="Cells"),aes(x=x,y=y))+
  geom_raster(fill="red")+facet_wrap(~il.idx)+
  labs(fill="Phase",title="Cell Phase")+
  theme_bw(20)+coord_equal()
Different Illuminations with Constant Threshold

Where segmentation fails: Canaliculi

Bone Slice

Here is a bone slice

  1. Find the larger cellular structures (osteocyte lacunae)
  2. Find the small channels which connect them together

The first task

is easy using a threshold and size criteria (we know how big the cells should be)

The second

is much more difficult because the small channels, with radii on the same order as the pixel size, are obscured by partial volume effects and noise.

Where segmentation fails: Brain Cortex

alz.df<-im.to.df(t(png::readPNG("ext-figures/cortex.png")))
ggplot(alz.df,aes(x=x,y=518-y))+
  geom_raster(aes(fill=val))+
  labs(fill="Electron Density",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
Brain with Threshold


Automated Threshold Selection

Many possible automated techniques


Given that applying a threshold is such a common and significant step, many tools have been developed to perform it automatically (unsupervised). This is particularly important in setups where images are rarely consistent, such as outdoor imaging with varying lighting (sun, clouds). The methods are based on several basic principles.

Automated Methods

Histogram-based methods

Just like we visually inspect a histogram, an algorithm can examine the histogram and find local minima between two peaks, points of maximum / minimum entropy, and other features
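As a concrete sketch of the histogram approach, Otsu's method (one of the standard algorithms behind FIJI's Auto Threshold menu) picks the threshold that maximizes the between-class variance of the two resulting groups. The function name `otsu.threshold` and the 256-bin resolution below are my own choices, not from the course code:

```r
# Sketch of Otsu's method, a histogram-based threshold:
# pick the split that maximizes the between-class variance.
otsu.threshold <- function(img.vals, n.bins=256) {
  brks <- seq(min(img.vals), max(img.vals), length.out=n.bins + 1)
  h <- hist(img.vals, breaks=brks, plot=FALSE)
  p    <- h$counts / sum(h$counts)    # probability of each gray-level bin
  w0   <- cumsum(p)                   # weight of the lower class at each split
  mu   <- cumsum(p * h$mids)          # cumulative mean up to each split
  mu.t <- mu[length(mu)]              # global mean
  # between-class variance for every candidate split
  sigma.b <- (mu.t * w0 - mu)^2 / (w0 * (1 - w0))
  h$mids[which.max(sigma.b)]          # NaN endpoints are ignored by which.max
}
```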

Image based Methods

These look at the statistics of the thresholded image itself (like entropy) to estimate the threshold

Results-based Methods

These search for a threshold which delivers the desired results in the final objects. For example, if you know you have an image of cells and each cell is between 200 and 10000 pixels, the algorithm runs thresholds until the objects are of the desired size
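A hedged sketch of such a search: sweep candidate thresholds and keep the one producing the most objects in the expected size range. The names `results.based.threshold` and `label.fn`, plus the candidate grid, are illustrative; `label.fn` stands in for any connected-component labeler, e.g. `bwlabel` from EBImage (loaded above):

```r
# Results-based threshold search (illustrative sketch): the "best"
# threshold is the one yielding the most objects of the desired size.
results.based.threshold <- function(img, label.fn,
                                    size.range=c(200, 10000),
                                    th.cands=seq(0.1, 0.9, by=0.05)) {
  scores <- sapply(th.cands, function(th) {
    labels <- label.fn(img < th)          # label the dark objects
    sizes  <- table(labels[labels > 0])   # pixel count per object
    sum(sizes >= size.range[1] & sizes <= size.range[2])
  })
  th.cands[which.max(scores)]             # most objects in the desired range
}
# e.g. results.based.threshold(cell.im, EBImage::bwlabel)
```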

Fiji -> Adjust -> Auto Threshold

There are many methods and they can be complicated to implement yourself. FIJI offers many of them as built-in functions so you can automatically try all of them on your image.

Pitfalls

While an incredibly useful tool, there are many potential pitfalls to these automated techniques.

Histogram-based

These methods are very sensitive to the distribution of pixels in your image; they may work very well on images with equal amounts of each phase but fail badly on images which have far more of one phase than the others

Image-based

These methods are sensitive to noise and a large noise content in the image can change statistics like entropy significantly.

Results-based

These methods are inherently biased by the expectations you have. If you want to find objects between 200 and 1000 pixels you will; they just might not be anything meaningful.

Realistic Approaches for Dealing with these Shortcomings

Imaging science rarely represents the ideal world and will never be 100% perfect. At some point we need to write our master's thesis, defend, or publish a paper. These are approaches for a more qualitative assessment; we will later cover how to do this more robustly with quantitative approaches

Model-based

One approach is to try to simulate everything (including noise) as well as possible, apply these techniques to many realizations of the same image, and qualitatively keep track of how many of the results accurately identify your phase. Hint: >95% seems to convince most biologists

Sample-based

Apply the methods to each sample and keep track of which threshold was used for each one. Then go back, apply each threshold to each sample, and keep track of how many of the results are correct enough to be used for further study.

Worst-case Scenario

Come up with the worst-case scenario (noise, misalignment, etc.) and assess how unacceptable the results are. Then try to estimate the interquartile range (25% - 75% of images).

Hysteresis Thresholding

For some images a single threshold does not work

ImageJ Source

bone.df<-im.to.df(png::readPNG("ext-figures/bonegfiltslice.png"))
ggplot(bone.df,aes(x=x,y=y))+
  geom_raster(aes(fill=val))+
  labs(fill="Calcification Dens",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
Bone Slice


ggplot(bone.df,aes(x=val))+
  geom_histogram(aes(y=..density..),alpha=0.5)+
  geom_density()+scale_y_sqrt()+
  labs(x="Calcification Dens")+
  theme_bw(20)
Bone Slice Histogram

thresh.fun<-function(x) {ifelse(x<0.01,"Air",ifelse(x<0.30,"Between","Bone"))}
bone.df$phase<-thresh.fun(bone.df$val)
ggplot(bone.df,aes(x=val))+
  geom_histogram(aes(fill=phase),binwidth=0.02,alpha=0.5)+
  geom_density(aes(y=15000/1.5*..scaled..))+
  labs(x="Calcification Density (au)")+
  scale_y_sqrt()+#(c(0,20))+
  theme_bw(20)
Bone Labeled Histogram

Hysteresis Thresholding

Comparing the original image with the three phases

ggplot(bone.df,aes(x=x,y=y))+
  geom_raster(aes(fill=val))+
  labs(fill="Calcification Dens",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
Bone Slice


ggplot(bone.df,aes(x=x,y=y))+
  geom_raster(aes(fill=phase))+
  labs(fill="Phase",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
Bone Slice

Hysteresis Thresholding: Reducing Pixels

Now we apply two important steps. The first is to remove the objects which are not cells (too small) using an opening operation.

air.im<-df.to.im(cbind(bone.df,isair=bone.df$phase=="Air"),"isair")
air.df<-im.to.df(opening(air.im,makeBrush(5,"disc")))
names(air.df)[3]<-"stillair"
nbone.df<-merge(air.df,bone.df)
ggplot(nbone.df,aes(x=x,y=y))+
  geom_raster(aes(fill=phase,alpha=stillair))+
  labs(fill="Phase",y="y",x="x",title="After Opening")+
  coord_equal()+guides(alpha=F)+
  theme_bw(20)
Bone Slice
# if it's air, make sure it's still air, otherwise demote it to between;
# if it's not air, leave it alone
nbone.df$phase<-ifelse(nbone.df$phase=="Air",
       ifelse(nbone.df$stillair>0,"Air","Between"),
       nbone.df$phase)


The second step is to keep the between pixels which are connected (by looking again at a neighborhood \(\mathcal{N}\)) to the air voxels and to ignore the others. This goes back to our original supposition that the smaller structures are connected to the larger structures

# incredibly low performance implementation (please do not copy)
bone.idf<-nbone.df
bet.pts<-nbone.df
# run while there is still new air being created
while(nrow(subset(bet.pts,phase=="Air"))>0) {
  air.pts<-subset(bone.idf,phase=="Air")[,c("x","y")]
  bone.pts<-subset(bone.idf,phase=="Bone")[,c("x","y")]
  bet.pts<-ddply(subset(bone.idf,phase=="Between"),.(x,y),function(in.pixel.lst) {
    in.pixel<-in.pixel.lst[1,]
    data.frame(phase=ifelse(min(with(air.pts,(in.pixel$x-x)^2+(in.pixel$y-y)^2))<=1,
         "Air",
         "Between"))
  })
  bone.idf<-rbind(bet.pts,
                  cbind(air.pts,phase="Air"),
                  cbind(bone.pts,phase="Bone"))
  print(nrow(subset(bet.pts,phase=="Air")))
}
ggplot(bone.idf,aes(x=x,y=y))+
  geom_raster(aes(fill=phase))+
  labs(fill="Phase",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
Bone Slice

More Complicated Images

As we briefly covered last time, many measurement techniques produce quite rich data.


nx<-4
ny<-4
n.pi<-4
grad.im<-expand.grid(x=c(-nx:nx)/nx*n.pi*pi,
                     y=c(-ny:ny)/ny*n.pi*pi)

grad.im<-cbind(grad.im,
               col=1.5*with(grad.im,abs(cos(x*y))/(abs(x*y)+(3*pi/nx)))+
                 0.5*runif(nrow(grad.im)),
              a=with(grad.im,sqrt(2/(abs(x)+0.5))),
               b=with(grad.im,0.5*sqrt(abs(x)+1)),
              th=0.5*runif(nrow(grad.im)),
              aiso=1,count=1)

create.ellipse.points<-function(x.off=0,y.off=0,a=1,b=NULL,th.off=0,th.max=2*pi,pts=36,...) {
  if (is.null(b)) b<-a
  th<-seq(0,th.max,length.out=pts)
  data.frame(x=a*cos(th.off)*cos(th)+b*sin(th.off)*sin(th)+x.off,
             y=-1*a*sin(th.off)*cos(th)+b*cos(th.off)*sin(th)+y.off,
             id=as.factor(paste(x.off,y.off,a,b,th.off,pts,sep=":")),...)
}
deform.ellipse.draw<-function(c.box) {
  create.ellipse.points(x.off=c.box$x[1],
                        y.off=c.box$y[1],
                        a=c.box$a[1],
                        b=c.box$b[1],
                        th.off=c.box$th[1],
                        col=c.box$col[1])                    
}

# normalize vector
tens.im<-ddply(grad.im,.(x,y),deform.ellipse.draw)

ggplot(tens.im,aes(x=x,y=y,group=as.factor(id),fill=col))+
  geom_polygon(color="black")+coord_fixed(ratio=1)+scale_fill_gradient(low="black",high="white")+guides(fill=F)+
  theme_bw(20)

K-Means Clustering / Classification

K-Means Algorithm

type:alert We give as an initial parameter the number of groups we want to find and possibly a criterion for removing groups that are too similar

1. Randomly create center points (groups) in vector space
2. Assign each data point to the closest center
3. Recalculate the centers from the mean point in each group
4. Go back to step 2 until the groups stop changing
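The steps above can be sketched directly in R as a teaching toy (`simple.kmeans` is an invented name; in practice use the built-in `kmeans()` as in the examples below):

```r
# Minimal k-means sketch mirroring the four listed steps
simple.kmeans <- function(pts, k, max.iter=100) {
  pts <- as.matrix(pts)
  centers <- pts[sample(nrow(pts), k), , drop=FALSE]   # 1. random centers
  cluster <- rep(0, nrow(pts))
  for (it in 1:max.iter) {
    # 2. assign each point to its closest center
    d <- as.matrix(dist(rbind(centers, pts)))[-(1:k), 1:k, drop=FALSE]
    new.cluster <- apply(d, 1, which.min)
    if (all(new.cluster == cluster)) break             # 4. groups stopped changing
    cluster <- new.cluster
    for (ci in 1:k)                                    # 3. centers from group means
      if (any(cluster == ci))
        centers[ci, ] <- colMeans(pts[cluster == ci, , drop=FALSE])
  }
  list(cluster=cluster, centers=centers)
}
```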


What vector space do we have?

Note: If you look for 2 groups you will almost always find two groups, whether or not they make any sense

K-Means Example

objColor=kmeans(indata,2);
grad.im$km.cluster<-kmeans(grad.im,2)$cluster
# normalize vector
tens.im<-ddply(grad.im,.(x,y),function(c.data) cbind(deform.ellipse.draw(c.data),
                                                     km.cluster=c.data[1,"km.cluster"]
                                                     )
               )

ggplot(tens.im,aes(x=x,y=y,group=as.factor(id),fill=as.factor(km.cluster)))+
  geom_polygon(color="black")+coord_fixed(ratio=1)+labs(fill="KMeans\nCluster")+
  theme_bw(20)
KMeans

splom(grad.im[,c("x","y","th","aiso")],groups=grad.im$km.cluster,pch=16)
Variable distributions


Or just the orientation

objColor=kmeans(indata(:,[aisoCol,thCol]),2);
grad.im$km.cluster<-kmeans(grad.im[,c("th")],2)$cluster
# normalize vector
tens.im<-ddply(grad.im,.(x,y),function(c.data) cbind(deform.ellipse.draw(c.data),
                                                     km.cluster=c.data[1,"km.cluster"]
                                                     )
               )
ggplot(tens.im,aes(x=x,y=y,group=as.factor(id),fill=as.factor(km.cluster)))+
  geom_polygon(color="black")+coord_fixed(ratio=1)+labs(fill="KMeans\nCluster")+
  theme_bw(20)
KMeans

splom(grad.im[,c("x","y","th","aiso")],
      groups=grad.im$km.cluster,pch=16)
Variable distributions

K-Means on Spectral Data

type:alert

Probabilistic Models of Segmentation

type:alert A more general approach is to use a probabilistic model for segmentation. We start with our image \(I(\vec{x}) \forall \vec{x}\in \mathbb{R}^N\) and we classify it into two phases \(\alpha\) and \(\beta\)

\[P(\{\vec{x} , I(\vec{x})\} | \alpha) \propto P(\alpha) + P(f(\vec{x}) | \alpha)+ P(\sum_{x^{\prime} \in \mathcal{N}} f(\vec{x^{\prime}}) | \alpha)\]

Fuzzy Classification

Fuzzy classification is based on fuzzy logic and fuzzy set theory; it is a general category for multi-valued logic (instead of simply true and false) and can be used to build IF and THEN statements from our probabilistic models.
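A minimal, hypothetical illustration of such rules: graded memberships replace the hard threshold, with `pmin` acting as fuzzy AND. The function names and breakpoints (0.5, 0.3) are arbitrary choices for illustration, not part of the course material:

```r
# Each pixel value in [0,1] gets a graded membership instead of true/false;
# pmin acts as fuzzy AND, pmax as fuzzy OR.
dark.membership   <- function(v) pmax(0, pmin(1, (0.5 - v) / 0.3))
bright.membership <- function(v) pmax(0, pmin(1, (v - 0.5) / 0.3))
# rule: "a pixel is cell-like IF it is dark AND its neighborhood mean is dark"
cell.membership <- function(v, nb.mean)
  pmin(dark.membership(v), dark.membership(nb.mean))
```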

Instead of

\[P(\{\vec{x} , I(\vec{x})\} | \alpha) \propto P(\alpha) + P(f(\vec{x}) | \alpha)+\] \[P(\sum_{x^{\prime} \in \mathcal{N}} f(\vec{x^{\prime}}) | \alpha)\]


Clear simple rules

which encompass aspects of filtering, thresholding, and morphological operations

Cell Colony

Beyond

Contouring

Expanding on the hole-filling issues examined before, a general problem in imaging is identifying regions of interest within an image.

Contouring: Automatic Methods

As mentioned before a few standard automated methods exist like the

Convex Hull Approach

takes all of the points in a given slice or volume and finds the smallest convex 3D volume which encloses all of those points.

cortbone.im<-imagedata(t(png::readPNG("ext-figures/bone.png")),"grey")
cortbone.df<-im.to.df(cortbone.im)
# calculate convex hull
cortbone.chull<-chull(subset(cortbone.df,val<1)[,c("x","y")])
cortbone.chull<-c(cortbone.chull,cortbone.chull[1])
cortbone.chull<-subset(cortbone.df,val<1)[cortbone.chull,c("x","y")]
ggplot(subset(cortbone.df,val<1),aes(x=x,y=518-y))+
  geom_polygon(data=cortbone.chull,aes(fill="convex hull"),alpha=0.5)+
  geom_raster(aes(fill="original"),alpha=0.8)+
  labs(fill="Image",y="y",x="x",title="Convex Hull Creation")+
  coord_equal()+
  theme_bw(20)
Cortical Segment Convex Hull

Rubber Band

## the rubber band function to fit a boundary around the curve
rubber.band.data<-function(raw.df,binning.pts=10,eval.fun=max) {
  in.df<-raw.df
  # calculate center of mass
  com.val<-colMeans(in.df)
  # add polar coordinates
  in.df$r<-with(in.df,sqrt((x-com.val["x"])^2+(y-com.val["y"])^2))
  in.df$theta<-with(in.df,atan2(y-com.val["y"],x-com.val["x"]))
  # create a maximum path
  outer.path<-ddply.cutcols(in.df,.(cut_interval(theta,binning.pts)),function(c.section) data.frame(r=eval.fun(c.section$r)))
  outer.path$x<-with(outer.path,r*cos(theta)+com.val["x"])
  outer.path$y<-with(outer.path,r*sin(theta)+com.val["y"])
  outer.path
}

Takes all of the points in a given slice and calculates the center of mass. It then calculates a piecewise-linear fit in polar coordinates which encloses all of those points.

ggplot(subset(cortbone.df,val<1),aes(x=x,y=518-y))+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),9),aes(fill="rubber band  9pts"),alpha=0.5)+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),36),aes(fill="rubber band  36pts"),alpha=0.5)+
  geom_raster(aes(fill="original"),alpha=0.8)+
  labs(fill="Image",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
First Rubber Bands


ggplot(subset(cortbone.df,val<1),aes(x=x,y=518-y))+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),90),aes(fill="rubber band  90pts"),alpha=0.5)+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),180),aes(fill="rubber band 180pts"),alpha=0.5)+
  geom_raster(aes(fill="original"),alpha=0.8)+
  labs(fill="Image",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
Better Rubber Bands

ggplot(subset(cortbone.df,val<1),aes(x=x,y=518-y))+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),180),aes(fill="rubber band 180pts"),alpha=0.5)+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),720),aes(fill="rubber band 720pts"),alpha=0.5)+
  geom_raster(aes(fill="original"),alpha=0.8)+
  labs(fill="Image",y="y",x="x")+
  coord_equal()+xlim(50,150)+ylim(-100,50)+
  theme_bw(20)
In Detail

Rubber Band: More flexible constraints

If we use quartiles or the average instead of the maximum value we can make the method less sensitive to outlier pixels

ggplot(subset(cortbone.df,val<1),aes(x=x,y=518-y))+
  geom_raster(aes(fill="original"),alpha=0.8)+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),90,eval.fun=mean),
               aes(fill="rubber band mean value"),alpha=0.5)+
  geom_polygon(data=rubber.band.data(subset(cortbone.df,val<1),90,
                                     eval.fun=function(x) quantile(x,0.99)),
               aes(fill="rubber band upper 99%"),alpha=0.5)+

  labs(fill="Image",y="y",x="x")+
  coord_equal()+
  theme_bw(20)
In Detail

Contouring: Guided Methods

type:alert

Many forms of guided methods exist, the most popular is known simply as the Magnetic Lasso in Adobe Photoshop (video).

The basic principle behind many of these methods is to optimize a set of user-given points based on local edge-like information in the image. In the brain cortex example, this is the small gradients in the gray values which our eyes naturally separate out as an edge but which have many gaps and discontinuities.

Active Contours / Snakes

Component Labeling

Once we have a clearly segmented image, it is often helpful to identify its sub-components. The easiest method for identifying these subcomponents is called component labeling, which again uses the neighborhood \(\mathcal{N}\) as a criterion for connectivity, resulting in pixels which are touching being part of the same object.

In general, the approach works well since usually when different regions are touching, they are related. It runs into issues when you have multiple regions which agglomerate together, for example a continuous pore network (1 object) or a cluster of touching cells.


The more general formulation of the problem is for networks (roads, computers, social networks): are the points start and finish connected?

nx<-5
ny<-10
cross.im<-expand.grid(x=c(-nx:nx),y=c(-ny:ny))
max.nodes.fn<-function(x,y) 1+round(abs(x+y/2)) # non-homogenous network connectivity
cross.im<-ddply(cross.im,.(x,y), function(in.pt) {
    out.df<-data.frame(xv=c(),yv=c())
    max.nodes<-max.nodes.fn(in.pt[1,"x"],in.pt[1,"y"])
    for(i in 1:round(runif(1,max=max.nodes))) {
      c.angle<-runif(1,max=pi)
      out.df<-rbind(out.df,data.frame(
        xv=round(cos(c.angle)),
        yv=round(sin(c.angle))
        ))
    }
    out.df
  })
id.fn<-function(x,y) (y*100+x)
cross.im$id<-with(cross.im,id.fn(x,y))

cross.lab.im<-cbind(cross.im,label="Normal")
cross.lab.im$label<-as.character(cross.lab.im$label)
cross.lab.im[which(cross.lab.im$id==cross.lab.im[1,]$id),"label"]<-"Begin"
cross.lab.im[which(cross.lab.im$id==cross.lab.im[nrow(cross.lab.im),]$id),"label"]<-"End"
cross.lab.im$label<-as.factor(cross.lab.im$label)
ggplot(cross.lab.im,aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=label),size=5)+geom_segment()+
  theme_bw(20)+labs(color="Goal")
Network connectivity

Component Labeling: Algorithm

We start out with the network and grow Begin into its connections, in a brushfire-style algorithm


# Network traversing algorithm
iter.grow<-function(ncross.lab.im,iter.cnt=10,source.phase="Begin",target.phase=NA,grow.into.phase="Normal") {
  if(is.na(target.phase)) target.phase<-source.phase
  
  for(iters in 1:iter.cnt) {
    begin.pts<-subset(ncross.lab.im,label==source.phase | label==target.phase)
    begin.ids<-unique(begin.pts$id)
    begin.pts.term<-subset(ncross.lab.im,id.fn(x+xv,y+yv) %in% begin.ids)  
    begin.pts<-rbind(begin.pts,
                     begin.pts.term
                     )
    ncross.lab.im$label<-as.character(ncross.lab.im$label)
    for(i in 1:nrow(begin.pts)) {
      cur.pt<-begin.pts[i,]
      child.id<-id.fn(cur.pt$x+cur.pt$xv,cur.pt$y+cur.pt$yv)
      switch.pixs<-which(ncross.lab.im$id==child.id & 
                            ncross.lab.im$label == grow.into.phase)
      if(length(switch.pixs)>0) ncross.lab.im[switch.pixs,"label"]<-target.phase
      }
    switch.pixs<-which(ncross.lab.im$id %in% begin.pts.term$id &
                          ncross.lab.im$label == grow.into.phase)
    
    if(length(switch.pixs)>0) ncross.lab.im[switch.pixs,"label"]<-target.phase
    }
  ncross.lab.im$label<-as.factor(ncross.lab.im$label)
  ncross.lab.im
  }

ggplot(iter.grow(cross.lab.im,1,target.phase="Pixels Grown"),aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=label),size=5)+geom_segment()+
  theme_bw(20)+labs(color="Type",title="1 Iteration from Begin")
One Growth Step

Component Labeling: Algorithm Steps

ggplot(iter.grow(cross.lab.im,10,target.phase="Pixels Grown"),aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=label),size=3)+geom_segment()+coord_equal()+
  theme_bw(20)+labs(color="Type",title="10 iterations from Begin")
10 Growth Steps

ggplot(iter.grow(cross.lab.im,10,source.phase="End",target.phase="Pixels Grown"),aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=label),size=3)+geom_segment()+coord_equal()+
  theme_bw(20)+labs(color="Type",title="10 iterations from End")
10 Growth Steps


ggplot(iter.grow(cross.lab.im,round(2*sqrt(20^2+10^2)),target.phase="Pixels Grown"),aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=label),size=5)+geom_segment()+coord_equal()+
  theme_bw(20)+labs(color="Type",title="Diagonal iterations from Begin")
Diagonal Growth Steps

Component Labeling: Images Algorithm

Same as for networks, but the neighborhood is defined with a kernel (circular, full, line, etc.) and labels must be generated for the image


nx<-10
ny<-5
ang.vals<-seq(0,pi/2,length.out=2)
cross.im<-expand.grid(x=c(-nx:nx),y=c(-ny:ny))
cross.im<-ddply(cross.im,.(x,y), function(in.pt) {
    out.df<-data.frame(xv=c(),yv=c())
    
    for(i in 1:length(ang.vals)) {
      c.angle<-ang.vals[i]
      out.df<-rbind(out.df,data.frame(
        xv=round(cos(c.angle)),
        yv=round(sin(c.angle))
        ))
    }
    out.df
  })

id.fn<-function(x,y) (y*100+x)
cross.im$id<-with(cross.im,id.fn(x,y))
cross.im<-subset(cross.im,x!=0 & y!=0)
ggplot(cross.im,aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=as.factor(id)),size=5)+geom_segment()+
  theme_bw(20)+labs(color="Goal")
Image as Network

Component Labeling: Image Algorithm

# Component labeling algorithm
im.iter.grow<-function(icross.im,iter.cnt=10) {
  ncross.im<-icross.im
  for(iters in 1:iter.cnt) {
    
    begin.pts<-ncross.im
    for(i in 1:nrow(begin.pts)) {
      cur.pt<-begin.pts[i,]
      child.id<-id.fn(cur.pt$x+cur.pt$xv,cur.pt$y+cur.pt$yv)
      # fill into lower points
      switch.pixs<-which(icross.im$id==child.id & 
                            icross.im$id > cur.pt$id)
      if(length(switch.pixs)>0) ncross.im[switch.pixs,"id"]<-cur.pt$id
      }
    icross.im<-ncross.im
    }
  ncross.im
  }

ggplot(im.iter.grow(cross.im,2),aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=as.factor(id)),size=3)+geom_segment()+coord_equal()+
  theme_bw(20)+labs(color="Type",title="2 Iterations")
Two Growth Steps

ggplot(im.iter.grow(cross.im,3),aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=as.factor(id)),size=3)+geom_segment()+
  coord_equal()+
  theme_bw(20)+labs(color="Type",title="3 Iterations")
Three Growth Steps


The image converges very quickly; after 4 iterations the task is complete. For larger, more complicated images with thousands of components this can take longer, but much more efficient algorithms for labeling components exist which alleviate this issue.

ggplot(im.iter.grow(cross.im,4),aes(x=x,y=y,xend=x+xv,yend=y+yv))+
  geom_point(aes(color=as.factor(id)),size=5)+geom_segment()+coord_equal()+
  theme_bw(20)+labs(color="Type",title="4 iterations")
Four Growth Steps
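One family of such faster labelers is the classic two-pass algorithm: a first scan assigns provisional labels and records which labels touch in a union-find (disjoint-set) structure, and a second scan resolves each pixel's label to its set's root. A minimal sketch of just the union-find part (function names are my own; the scanning passes are omitted):

```r
# Union-find (disjoint-set) core of two-pass component labeling
uf.new <- function(n) seq_len(n)              # every label starts as its own root
uf.find <- function(parent, i) {              # walk up to the root label
  while (parent[i] != i) i <- parent[i]
  i
}
uf.union <- function(parent, a, b) {          # record that two labels touch
  parent[uf.find(parent, a)] <- uf.find(parent, b)
  parent
}
```

During the first image pass every newly touching pair of provisional labels gets a `uf.union` call; the second pass replaces each pixel's label with its `uf.find` root, so all touching labels collapse into one object.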

Component Labeling: Image Algorithm

In reality, component labeling algorithms are usually already implemented in standard packages, but it is important to understand how they work and their relationship with the neighborhood in order to interpret the results correctly.

labImg=bwlabel(imName)

Component Labeling: Image Algorithm

cell.im<-jpeg::readJPEG("ext-figures/Cell_Colony.jpg")
cell.lab.df<-im.to.df(bwlabel(cell.im<.6))
ggplot(subset(cell.lab.df,val>0),aes(x=x,y=y,fill=val))+
  geom_raster()+
  labs(fill="Label",title="Labeled Image")+
  theme_bw(20)+coord_equal()
Cell Colony

size.histogram<-ddply(subset(cell.lab.df,val>0),.(val),function(c.label) data.frame(count=nrow(c.label)))
keep.vals<-subset(size.histogram,count>25)
ggplot(size.histogram,aes(x=count))+
  geom_density(aes(color="Entire Labeled Images"))+
  geom_density(data=keep.vals,aes(color="Larger Cells"))+
  labs(x="Cell Volume",y="Frequency of Cells",title="Size Distribution",color="Distribution")+
  scale_y_sqrt()+
  theme_bw(20)
Cell Colony


ggplot(subset(cell.lab.df,val %in% keep.vals$val),aes(x=x,y=y,fill=as.factor(val)))+
  geom_raster()+
  labs(fill="Intensity",title="Larger Cells (>25px)")+
  theme_bw(20)+coord_equal()
Cell Colony

Component Labeling: Image Algorithm

Now all the voxels which are connected have the same label. We can then perform simple metrics like
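For example, the size and center of mass of each labeled object can be computed with the same ddply pattern used above; `label.metrics` is an illustrative helper of my own, assuming an im.to.df-style frame with columns x, y and the label in val:

```r
require(plyr)  # for ddply, as loaded at the top of the document
# Per-object metrics from a labeled image data frame
label.metrics <- function(lab.df) {
  ddply(subset(lab.df, val > 0), .(val), function(c.obj)
    data.frame(area  = nrow(c.obj),      # pixel count of the object
               com.x = mean(c.obj$x),    # center of mass, x
               com.y = mean(c.obj$y)))   # center of mass, y
}
```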

Next week we will cover how more detailed analysis can be performed on these data.